NONCLASSICAL CONTROL PROBLEMS AND STACKELBERG GAMES

G.  P.  Papavassilopoulos  and  J.  B.  Cruz,  Jr. 


Abstract: A nonclassical control problem, where the control depends on
state and time, and its partial derivative with respect to the state appears
in the state equation and in the cost function, is analyzed. Stackelberg
dynamic games which lead to such nonclassical control problems are considered
and studied.

Key Words: Stackelberg games, nonclassical optimal control, variational
methods.




The authors are with the Decision and Control Laboratory, Coordinated Science
Laboratory, University of Illinois, Urbana, Illinois 61801.

This work was supported in part by the National Science Foundation under Grant
ENG-74-20091, in part by the Department of Energy, Electric Energy Systems
Division under Contract U.S. ERDA EX-76-C-01-2088, and in part by the Joint
Services Electronics Program under Contract DAAB-07-72-C-0259.




MAIN  ERRATA 


P. 16 line 14

$$u(x,t) = e^{x_1(x_1-\bar x_1(t))}\,\bar u(t) + [\bar u_1(t) - \bar x_1(t)\,\bar u(t)]\,[x_1 - \bar x_1(t)]$$

P.  26,  27,  33  Where  m,  read 
P.  27.  relation  75 


R.  ■ Y.I,  y -..-V  - Y>0,  i»l,...,n. 


P.  29  line  11 


Instead  of:  "(h(x(t),t))."  read  "(x(t),t)". 
Hierarchical and large scale systems have received considerable
attention during the last few years; firstly because of their importance in
engineering, economics and other areas, and secondly because of the increased
capability of computer facilities [13], [14]. An important characteristic of
many large scale systems is the presence of many decision makers with different
and usually conflicting goals. The existence of many decision makers who
interact through the system and have different goals may be an inherent
property of the system under consideration (e.g., a market situation), or
may be simply the result of modeling the system as such (e.g., a large
system decomposed into subsystems for calculation purposes). Differential
games are useful in modeling and studying dynamic systems where more than
one decision maker is involved. Most of the questions posed in the area of
the classical control problem may be considered in a game situation, but
their resolution is generally more difficult. In addition, many questions
can be posed in a game framework which are meaningless or trivial in a
classical control problem framework. The superior conceptual wealth of games
over control problems, which makes them potentially much more applicable,
counterbalances the additional difficulties encountered in their solution.

A particular class of games are the so-called Stackelberg
differential games [1]-[8]. Stackelberg games provide a natural formalism
for describing systems which operate on many different levels with a
corresponding hierarchy of decisions. The mathematical definition of a general
two-level Stackelberg game is as follows. Let U, V be two sets and $J_1$, $J_2$
two real valued functions

$$J_i : U \times V \to R, \quad i = 1,2. \tag{1}$$

We consider the set valued mapping T

$$T : U \to V, \quad u \mapsto Tu \subset V \tag{2}$$

defined by

$$Tu = \{v \mid v = \arg\inf[J_2(u,v);\ v \in V]\}. \tag{3}$$

Clearly $Tu = \emptyset$ if the inf in definition (3) is not achieved. We also consider
the minimization problem

$$\inf J_1(u,v) \tag{4}$$
$$\text{subject to: } u \in U,\ v \in Tu,$$

where we use the usual convention $J_1(u,v) = +\infty$ if $v \in Tu = \emptyset$.

Definition: A pair $(u^*,v^*) \in U \times V$ is called a Stackelberg equilibrium pair if
$(u^*,v^*)$ solves (4).
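As a rough illustration of definitions (3) and (4), the following minimal sketch enumerates a static two-level game on finite grids. The quadratic costs, the grids, and the function names `follower_reaction` and `stackelberg_pair` are assumptions made only for this example; they do not come from the paper, and a grid search is merely a stand-in for the inf in (3) and (4).

```python
def follower_reaction(u, V, J2, tol=1e-12):
    """Tu on a finite grid: every v in V attaining the minimum of J2(u, .)."""
    values = [J2(u, v) for v in V]
    best = min(values)
    return [v for v, val in zip(V, values) if val - best <= tol]


def stackelberg_pair(U, V, J1, J2):
    """Enumerate problem (4): minimize J1(u, v) over u in U, v in Tu."""
    best = None
    for u in U:
        for v in follower_reaction(u, V, J2):
            cost = J1(u, v)
            if best is None or cost < best[0]:
                best = (cost, u, v)
    return best


# Hypothetical quadratic costs, chosen only for illustration.
def J1(u, v):  # leader's cost
    return (u - 1.0) ** 2 + (v - 1.0) ** 2


def J2(u, v):  # follower's cost; its unconstrained best reply is v = u / 2
    return (v - u) ** 2 + v ** 2


U_grid = [i / 50 for i in range(-100, 101)]    # step 0.02 on [-2, 2]
V_grid = [j / 100 for j in range(-200, 201)]   # step 0.01, contains every u / 2

cost_star, u_star, v_star = stackelberg_pair(U_grid, V_grid, J1, J2)
print(u_star, v_star, cost_star)
```

For these assumed costs the follower's reaction is v = u/2, so the leader effectively minimizes (u-1)^2 + (u/2-1)^2, whose minimizer u = 1.2 (with v = 0.6) lies on the chosen grids.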

The sets U and V are called the leader's and follower's strategy spaces
respectively. The game situation described by the mathematical formulation
above is as follows. The follower tries to minimize his cost function $J_2$,
for a given choice of $u \in U$ by the leader. The leader, knowing the follower's
rationale, wishes to announce a $u^*$ such that the follower's reaction $v^*$ to
this given $u^*$ will result in the minimum possible $J_1(u^*,v^*)$. The general
N-level Stackelberg game is defined analogously. Stackelberg differential
games were first introduced and studied in the engineering literature in [2]
and further studied in [3]-[8]. They are mathematically formalized as
follows


$$\dot x(t) = f(x(t),u(t),v(t),t), \quad x(t_0) = x_0,$$
$$J_i(u,v) = g_i(x(t_f)) + \int_{t_0}^{t_f} L_i(x(t),u(t),v(t),t)\,dt, \quad i = 1,2, \tag{5}$$

where f, $g_i$, $L_i$ are appropriately defined functions. Also, $u \in U$, $v \in V$, where
U, V are appropriately defined function spaces and u(t), v(t) are the values
of u and v respectively at time t, i.e., $u(t) = u|_t$, $v(t) = v|_t$. The type
of strategy spaces U and V which were considered and treated successfully in
the previous literature were the spaces of piecewise continuous
functions of time. In this case, the problem of deriving necessary
conditions for the Stackelberg differential game with fixed time interval and
initial condition $x_0$ falls within the area of classical control. Thus,
variational techniques can be used in a straightforward manner. The case
where the strategy spaces are spaces of functions whose values at instant t
depend on the current state x(t) and time t, i.e., $u(t) = u|_t = u(x(t),t)$,
$v(t) = v|_t = v(x(t),t)$, was not treated. This case results in a nonclassical
control problem because $\partial u/\partial x$ appears in the follower's necessary conditions.
Since the follower's necessary conditions are seen as state differential
equations by the leader, the presence of $\partial u/\partial x$ in them makes the leader face a
nonclassical control problem.

In the present paper, the nonclassical control problem arising
from the consideration of the above strategy spaces is embedded in a more
general class of nonclassical control problems, see (6), (7). The
characteristics of this general class of problems are the following:
(i) each of the components $u^i$ of the control m-vector u depends on the
current time t and on a given function of the current state and time, i.e.
$u^i|_t = u^i(h^i(x(t),t),t)$; (ii) the state equation and the cost functional
depend on the first order partial derivative of u with respect to the state
x. The vector valued functions $h^i$ may represent outputs or measurements
available to the i-th "subcontroller," in a decentralized control setting.

The only restriction to be imposed on $h^i$ is that it be twice continuously
differentiable with respect to x. This allows for a quite large class of
$h^i$'s which can model output feedback or open loop control laws. It can also
model mixed cases of open loop and output feedback control laws where an
output is available only during certain intervals of time. The appearance of
the partial derivative of u with respect to x prohibits the restriction of
the admissible controls to those which are functions of time only. It will
become clear that the extension of our results to the case where higher order
partial derivatives of u with respect to x, up to order N, appear is
straightforward. This case is of interest in hierarchical systems since it arises,
for example, in an N-level Stackelberg game where the players use control
values dependent on the current state and time. Although the bulk of the
analysis provided in this paper concerns continuous time problems, the
corresponding discrete time results can be derived in a very similar manner.

The structure of the present paper is as follows: In Section 1, a
nonclassical control problem central to the whole development is defined and
studied. In Section 2, a two-level Stackelberg differential game is treated
for a fixed time interval $[t_0,t_f]$ and initial condition $x(t_0) = x_0$. The
leader's and follower's strategies are functions of the current state and
time, and the results of Section 1 are used for deriving necessary conditions
for this game. Certain interpretations of the results are also given. In
Section 3, a Linear Quadratic Stackelberg game is solved as a specific
application of the theory. Finally, we have a conclusions section.


Notation and Abbreviations

$R^n$ : n-dimensional real Euclidean space with the Euclidean metric

$\|\cdot\|$ : denotes the Euclidean metric for vectors and the sup norm for matrices

$'$ : denotes transposition for vectors and matrices

For a function $f : R^n \to R^m$ we say that $f \in C^k$ if f has continuous
mixed partial derivatives of order k. For $f : R^n \to R$, $\nabla f$ is considered as
an $n \times 1$ column vector and $f_{xx}$ denotes the Hessian of f. For $f : R^n \to R^m$, $\nabla f$ is
considered as an $n \times m$ matrix (Jacobian). For $f : R^n \times R^\ell \to R^m$, where $x \in R^n$,
$y \in R^\ell$, $f(x,y) \in R^m$, we denote by $\partial f/\partial x$ or $f_x$ or $\nabla_x f$ the Jacobian matrix of the partial
derivatives of f with respect to x; it is considered as an $n \times m$ matrix.

w.r. to: with respect to

w.l.o.g.: without loss of generality

n.b.d.: neighborhood.


1. A Nonclassical Problem

Consider the dynamic system described by

$$\dot x(t) = f\big(x(t),\, u^1(h^1(x(t),t),t), \dots, u^m(h^m(x(t),t),t),\, u^1_x(h^1(x(t),t),t), \dots, u^m_x(h^m(x(t),t),t),\, t\big), \quad x(t_0) = x_0, \quad t \in [t_0,t_f] \tag{6}$$

and the functional

$$J(u) = g(x(t_f)) + \int_{t_0}^{t_f} L\big(x(t),\, u^1(h^1(x(t),t),t), \dots, u^m(h^m(x(t),t),t),\, u^1_x(h^1(x(t),t),t), \dots, u^m_x(h^m(x(t),t),t),\, t\big)\,dt \tag{7}$$

where the functions $f : R^{n+m+mn+1} \to R^n$, $L : R^{n+m+mn+1} \to R$, $h^i : R^{n+1} \to R^{q_i}$,
$i = 1,\dots,m$, $g : R^n \to R$ are continuous in all arguments and in $C^1$ with respect to
the x, u, $u_x$. The functions $h^i : R^{n+1} \to R^{q_i}$ are continuous, and in $C^2$ w.r. to x.
The time interval $[t_0,t_f]$ is considered fixed w.l.o.g. (see [10]). We want to
find a function u, where

$$u^i : R^{q_i} \times [t_0,t_f] \to R, \quad i = 1,\dots,m,$$

$u^i_x(h^i(x,t),t)$ exists and $u^i(h^i(x,t),t)$, $u^i_x(h^i(x,t),t)$ are continuous in x
and piecewise continuous in t, for $x \in R^n$, $t \in [t_0,t_f]$, $i = 1,\dots,m$, so as to
minimize J(u). We denote by U the set of all such u's. Therefore the
problem under investigation is

$$\text{minimize } J(u) \tag{8}$$
$$\text{subject to } u \in U \text{ and (6)}.$$


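We will use the notation

$$f_u = \left[\frac{\partial f_j}{\partial u^i}\right],\ m \times n \text{ matrix}, \qquad L_u = \left[\frac{\partial L}{\partial u^i}\right],\ m \times 1 \text{ vector},$$

$$f_i = \frac{\partial f}{\partial (u^i_x)},\ n \times n \text{ matrix}, \qquad L_i = \frac{\partial L}{\partial (u^i_x)},\ n \times 1 \text{ vector}, \qquad i = 1,\dots,m,$$

$$y^i = (y^i_1,\dots,y^i_j,\dots,y^i_{q_i})' \in R^{q_i}, \qquad u^i_j = \frac{\partial u^i}{\partial y^i_j},$$

$$h^i = (h^i_1,\dots,h^i_j,\dots,h^i_{q_i})', \qquad i = 1,\dots,m,\ j = 1,\dots,q_i,$$

$$u^i_{y^i} = (u^i_1,\dots,u^i_{q_i})',$$

$$u^i_x = \frac{\partial u^i(h^i(x,t),t)}{\partial x} = \nabla_x h^i(x,t)\,\frac{\partial u^i(y^i,t)}{\partial y^i}\bigg|_{y^i = h^i(x,t)},\ n \times 1 \text{ vector}, \quad i = 1,\dots,m,$$

$$u_x = [u^1_x, \dots, u^m_x],\ n \times m \text{ matrix}.$$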

This problem is posed for a fixed time interval $[t_0,t_f]$ and initial condition
$x(t_0) = x_0$. Therefore the solution $u^*$, if it exists, will in general depend
on $t_0$, $t_f$, $x_0$, but we do not show this dependence explicitly.

It should be pointed out that the arguments used in Classical
Control Theory for showing that for the fixed initial point case, it is
irrelevant for the optimal trajectory and cost whether the control value at
time t is composed by using x(t) and t or only t, do not apply here. If
$u|_t = u(t)$, $t \in [t_0,t_f]$, then $u_x \equiv 0$ and this changes the structure of problem
(8). Consideration of variations of $u_x$ is also needed and this was where the
previous researchers stopped, see [4]. This problem is successfully treated here
by proving an extension (Lemma 1.1) of the so-called "fundamental lemma" in
the Calculus of Variations (see [12]).

The following theorem provides necessary conditions for a function
$u \in U$ to be a solution to the problem (8) in a local sense (we assume that U
is properly topologized). It is assumed in this theorem that the optimum $u^*$
has strong differentiability properties, an assumption which will be relaxed
later, in Theorem 1.2. The proof of this theorem is based on the following lemma.


Lemma 1.1: Let $M : [t_0,t_f] \to R^m$, $N_i : [t_0,t_f] \to R^n$, $i = 1,\dots,m$, $y : [t_0,t_f] \to R^n$,
be continuous functions, such that

$$\int_{t_0}^{t_f} M'(t)\,\varphi(y(t),t)\,dt + \sum_{i=1}^{m} \int_{t_0}^{t_f} N_i'(t)\,\varphi^i_y(y(t),t)\,dt = 0$$

for every continuous function $\varphi : R^n \times [t_0,t_f] \to R^m$, where $\varphi = (\varphi^1,\dots,\varphi^m)'$, and
$\varphi$ is in $C^1$ w.r. to y. Then $M, N_1,\dots,N_m$ are identically zero on $[t_0,t_f]$.

Proof of Lemma 1.1: The choice $\varphi(y,t) = (0,\dots,0,\varphi^i(t),0,\dots,0)'$, $\varphi^i : [t_0,t_f] \to R$, $\varphi^i$
continuous in t, $i = 1,\dots,m$, yields $M \equiv 0$ on $[t_0,t_f]$. Since $M \equiv 0$, the
choice $\varphi = (0,\dots,y'\psi,0,\dots,0)'$, $\varphi^i = y'\psi$, where $\psi = (\psi_1,\dots,\psi_n)'$,
$\psi : [t_0,t_f] \to R^n$, $\psi$ continuous in t, results in $\int_{t_0}^{t_f} \psi'(t)N_i(t)\,dt = 0$ for every
such $\psi$, and thus $N_i \equiv 0$ on $[t_0,t_f]$ is proven in the same way as $M \equiv 0$ was
proven. □

The conclusion of the above lemma holds even if the restriction
$\varphi^i(y,t) = y_1^{k_{1i}} \cdots y_n^{k_{ni}}\, t^{\lambda_i}$ is imposed, where $k_{1i},\dots,k_{ni},\lambda_i$ are nonnegative
integers, since the polynomials are dense in the space of measurable functions
on $[t_0,t_f]$.

Theorem 1.1: Let $u^* \in U$ be a solution of (8) which gives rise to a trajectory
$\{(x^*(t),t) \mid t \in [t_0,t_f]\}$, such that $u^{i*}_{y^i}$ are in $C^1$ w.r. to x in a n.b.d. of
$\{(h^i(x^*(t),t),t),\ t \in [t_0,t_f]\}$. Then there exists a function $p : [t_0,t_f] \to R^n$ such
that

$$-\dot p(t) = L_x + f_x p + \sum_{i=1}^{m} \sum_{j=1}^{q_i} u^i_j\, \nabla_{xx} h^i_j\, (L_i + f_i p) \tag{10}$$

$$L_u + f_u p = 0 \tag{11}$$

$$(\nabla_x h^i)'\,(L_i + f_i p) = 0, \quad i = 1,\dots,m \tag{12}$$

$$p(t_f) = \frac{\partial g(x^*(t_f))}{\partial x} \tag{13}$$

hold for $t \in [t_0,t_f]$, where all the partial derivatives are evaluated at
$x^*(t)$, $u^{i*}(h^i(x^*(t),t),t)$, $u^{i*}_x(h^i(x^*(t),t),t)$, t.


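Proof of Theorem 1.1: Let $g \equiv 0$ w.l.o.g. (see [10]). Consider a function
$\varphi \in U$, $\varphi = (\varphi^1,\dots,\varphi^m)'$, which has the same continuity and differentiability
properties as $u^*$. Such a $\varphi$ will be called admissible. Using the known
theorems on the dependence of solutions of differential equations on
parameters, we conclude that for $\epsilon \in R$, $\epsilon$ sufficiently small, $u^* + \epsilon\varphi$ gives
rise to a trajectory $\{(x(\epsilon,t),t) \mid t \in [t_0,t_f]\}$, $x(0,t) = x^*(t)$, and that
$x(\epsilon,t)$ is in $C^1$ w.r. to $\epsilon$. Direct calculation yields

$$\frac{d}{dt}\left(\frac{\partial x(\epsilon,t)}{\partial \epsilon}\right) = \Big[f_x + (u^*_x + \epsilon\varphi_x) f_u + \sum_{i=1}^{m} (u^{i*}_{xx} + \epsilon\varphi^i_{xx}) f_i\Big]' \frac{\partial x}{\partial \epsilon} + f_u'\varphi + \sum_{i=1}^{m} f_i'\,\nabla_x h^i\,\varphi^i_{y^i}, \tag{14}$$

$$\frac{\partial x(\epsilon,t)}{\partial \epsilon}\bigg|_{t=t_0} = 0. \tag{15}$$

We set

$$A(t) = \Big[f_x + u_x f_u + \sum_{i=1}^{m} \sum_{j=1}^{q_i} u^i_j\, \nabla_{xx} h^i_j\, f_i\Big]' \tag{16}$$

$$B_1(t) = f_u' \tag{17}$$

$$B_2^i(t) = f_i'\,\nabla_x h^i, \quad i = 1,\dots,m \tag{18}$$

where A, $B_1$, $B_2^i$ are evaluated at t, $x^*$, $u^*$, $u^*_x$ and, thus, for $\epsilon = 0$, (14) can
be written as

$$\dot z = A z + B_1 \varphi + \sum_{i=1}^{m} B_2^i\, \varphi^i_{y^i}, \quad z(t_0) = 0, \tag{19}$$

where z(t) denotes $\partial x(\epsilon,t)/\partial\epsilon$ at $\epsilon = 0$. For fixed $\varphi$ we consider

$$J(\epsilon) = J(u^* + \epsilon\varphi).$$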
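Since $J(\epsilon)$ is in $C^1$ w.r. to $\epsilon$ and $u^*$ is a local optimum, it must hold

$$\frac{dJ(\epsilon)}{d\epsilon}\bigg|_{\epsilon=0} = 0.$$

Direct calculation yields

$$\frac{dJ}{d\epsilon} = \int_{t_0}^{t_f} \Big\{ \Big[L_x + (u^*_x + \epsilon\varphi_x) L_u + \sum_{i=1}^{m} (u^{i*}_{xx} + \epsilon\varphi^i_{xx}) L_i\Big]' \frac{\partial x}{\partial \epsilon} + L_u'\varphi + \sum_{i=1}^{m} L_i'\,\nabla_x h^i\,\varphi^i_{y^i} \Big\}\,dt. \tag{20}$$

Setting

$$\Gamma(t) = \Big[L_x + u_x L_u + \sum_{i=1}^{m} \sum_{j=1}^{q_i} u^i_j\, \nabla_{xx} h^i_j\, L_i\Big]' \tag{21}$$

$$\Delta_1(t) = L_u' \tag{22}$$

$$\Delta_2^i(t) = L_i'\,\nabla_x h^i, \quad i = 1,\dots,m, \tag{23}$$

with $\Gamma$, $\Delta_1$, $\Delta_2^i$ evaluated at $x^*$, $u^*$, $u^*_x$, we conclude from (20)-(23) that

$$\int_{t_0}^{t_f} \Big[\Gamma z + \Delta_1 \varphi + \sum_{i=1}^{m} \Delta_2^i\, \varphi^i_{y^i}\Big]\,dt = 0. \tag{24}$$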


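Therefore (24) must hold for every admissible $\varphi$. Let $\Phi(t,\tau)$ be the transition
matrix of A(t). Let also $\bar\varphi(t)$ denote the vector $(\varphi^1(h^1(x^*(t),t),t),\dots,$
$\varphi^m(h^m(x^*(t),t),t))'$ and $\bar\varphi^i(t)$ the vector $\varphi^i_{y^i}(h^i(x^*(t),t),t)$. Then from
(19) we obtain

$$z(t) = \int_{t_0}^{t} \Phi(t,\tau)\Big[B_1(\tau)\bar\varphi(\tau) + \sum_{i=1}^{m} B_2^i(\tau)\bar\varphi^i(\tau)\Big]\,d\tau, \quad t \in [t_0,t_f] \tag{25}$$

and substituting in (24) we obtain

$$\int_{t_0}^{t_f} \Big\{\Gamma(t) \int_{t_0}^{t} \Phi(t,\tau)\Big[B_1(\tau)\bar\varphi(\tau) + \sum_{i=1}^{m} B_2^i(\tau)\bar\varphi^i(\tau)\Big]\,d\tau + \Delta_1(t)\bar\varphi(t) + \sum_{i=1}^{m} \Delta_2^i(t)\bar\varphi^i(t)\Big\}\,dt = 0. \tag{26}$$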


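Let $\chi_{[a,b]}$ denote the indicator function of $[a,b] \subset [t_0,t_f]$. We can
interchange the order of integration in (26) since the integrated quantities are
bounded on $[t_0,t_f] \times [t_0,t_f]$ (Fubini's Theorem). Using the fact
$\chi_{[t_0,t]}(\tau) = \chi_{[\tau,t_f]}(t)$ we have successively

$$\int_{t_0}^{t_f}\!\!\int_{t_0}^{t_f} \Big[\Gamma(t)\Phi(t,\tau)B_1(\tau)\bar\varphi(\tau) + \Gamma(t)\Phi(t,\tau)\sum_{i=1}^{m} B_2^i(\tau)\bar\varphi^i(\tau)\Big]\,\chi_{[\tau,t_f]}(t)\,d\tau\,dt$$
$$= \int_{t_0}^{t_f} \Big[\int_{\tau}^{t_f} \Gamma(t)\Phi(t,\tau)\,dt\Big] B_1(\tau)\bar\varphi(\tau)\,d\tau + \sum_{i=1}^{m} \int_{t_0}^{t_f} \Big[\int_{\tau}^{t_f} \Gamma(t)\Phi(t,\tau)\,dt\Big] B_2^i(\tau)\bar\varphi^i(\tau)\,d\tau. \tag{27}$$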


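By introducing

$$p'(\tau) = \int_{\tau}^{t_f} \Gamma(t)\Phi(t,\tau)\,dt \tag{28}$$

(26) can be written as

$$\int_{t_0}^{t_f} [p'(\tau)B_1(\tau) + \Delta_1(\tau)]\,\bar\varphi(\tau)\,d\tau + \sum_{i=1}^{m} \int_{t_0}^{t_f} [p'(\tau)B_2^i(\tau) + \Delta_2^i(\tau)]\,\bar\varphi^i(\tau)\,d\tau = 0. \tag{29}$$

Applying Lemma 1.1 to (29), we obtain

$$p'(\tau)B_1(\tau) + \Delta_1(\tau) \equiv 0 \text{ on } [t_0,t_f] \tag{30}$$

$$p'(\tau)B_2^i(\tau) + \Delta_2^i(\tau) \equiv 0 \text{ on } [t_0,t_f]. \tag{31}$$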

Using (17), (18) and (22), (23) in (30), (31) we have equivalently (11) and
(12). Differentiation of (28) and use of (16), (21) and (11) give the equivalent
to (28)

$$-\dot p = L_x + f_x p + \sum_{i=1}^{m} \sum_{j=1}^{q_i} u^i_j\, \nabla_{xx} h^i_j\, (L_i + f_i p), \quad p(t_f) = 0.$$

The assumption $g \equiv 0$ is removed in the known way, resulting in (13). □

We give now a different derivation of the results of Theorem 1.1,
under weaker assumptions, which provides an interpretation for
them and at the same time an extension of the region of their validity. Let

$$\bar U_k = \{\bar u \mid \bar u : [t_0,t_f] \to R^k,\ \bar u \text{ piecewise continuous}\}. \tag{32}$$

Consider the problem

$$\text{minimize } \bar J(\bar u,\bar u_1,\dots,\bar u_m) = g(x(t_f)) + \int_{t_0}^{t_f} L(x,\bar u,\nabla_x h^1(x,t)\bar u_1,\dots,\nabla_x h^m(x,t)\bar u_m,t)\,dt$$
$$\text{subject to } \dot x = f(x,\bar u,\nabla_x h^1(x,t)\bar u_1,\dots,\nabla_x h^m(x,t)\bar u_m,t), \quad x(t_0) = x_0, \quad t \in [t_0,t_f] \tag{33}$$
$$\bar u \in \bar U_m, \quad \bar u_i \in \bar U_{q_i}, \quad i = 1,\dots,m.$$

Clearly, if $\bar J^*$, $J^*$ are the infima of (33) and (8) respectively, it will be
$\bar J^* \le J^*$. Also, if $\bar u = (\bar u^1,\dots,\bar u^m)'$, $\bar u_1,\dots,\bar u_m$ solve (33) and give rise to $\bar x(t)$,
then any $u = (u^1,\dots,u^m)' \in U$ with

$$\big(u^1(h^1(\bar x(t),t),t),\dots,u^m(h^m(\bar x(t),t),t)\big)' = \bar u(t), \qquad u^i_x(h^i(\bar x(t),t),t) = \nabla_x h^i(\bar x(t),t)\,\bar u_i(t), \quad i = 1,\dots,m \tag{34}$$

results in $J(u) = \bar J(\bar u,\bar u_1,\dots,\bar u_m)$ and gives rise to the same $\bar x(t)$. However,
such $u \in U$ does exist. For example we set

$$u^i(h^i(x,t),t) = a_i'(t)\,h^i(x,t) + b_i(t) \tag{35}$$

where

$$a_i(t) = \bar u_i(t) \tag{36}$$

$$b_i(t) = \bar u^i(t) - a_i'(t)\,h^i(\bar x(t),t) \tag{37}$$

$i = 1,\dots,m$.

This u satisfies (34). Thus, problems (33) and (8) are actually equivalent,
in the sense that for each given $(x_0,t_0)$ they have the same optimal
trajectories and costs and their optimal controls are related by (34).
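The affine construction (35)-(37) can be checked numerically in the simplest setting. The sketch below assumes the scalar case m = 1, n = 1, h(x,t) = x, and uses made-up open-loop data for the trajectory, the control value and the slope; the helper names `affine_realization` and `d_dx` are illustrative only. It confirms that the resulting feedback law reproduces the prescribed value and x-derivative along the trajectory, which is what (34) requires.

```python
import math


def affine_realization(u_bar, u1_bar, x_bar):
    """u(x, t) = a(t) * x + b(t) with a, b as in (36)-(37), for h(x, t) = x."""
    def u(x, t):
        a = u1_bar(t)                    # (36): a(t) equals the prescribed slope
        b = u_bar(t) - a * x_bar(t)      # (37): b(t) fixes the value on the trajectory
        return a * x + b
    return u


def d_dx(fn, x, t, eps=1e-6):
    """Central finite-difference approximation of the partial derivative in x."""
    return (fn(x + eps, t) - fn(x - eps, t)) / (2.0 * eps)


# Hypothetical open-loop data (assumptions for illustration only).
x_bar = math.cos                 # trajectory x_bar(t)
u_bar = math.sin                 # prescribed control value u_bar(t)
u1_bar = lambda t: 1.0 + t       # prescribed derivative value u_bar_1(t)

u = affine_realization(u_bar, u1_bar, x_bar)
times = [0.0, 0.5, 1.0, 2.0]
value_err = max(abs(u(x_bar(t), t) - u_bar(t)) for t in times)
slope_err = max(abs(d_dx(u, x_bar(t), t) - u1_bar(t)) for t in times)
print(value_err, slope_err)
```

Both errors should be at the level of floating-point and finite-difference noise; away from the trajectory the law is free to take any values, which is the source of the nonuniqueness discussed later in this section.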

The conditions of Theorem 1.1 are now directly verified to be the
necessary conditions for problem (33), where one should use $\bar u$ and $\bar u_i$ in
place of u and $u^i_{y^i}$ respectively. More importantly, the conditions of Theorem
1.1 hold if one considers simply $u^* \in U$, without assuming that $u^{i*}_{y^i}$ is in $C^1$
w.r. to x in a n.b.d. of $\{(h^i(x^*(t),t),t),\ t \in [t_0,t_f]\}$. This weakens the
strong differentiability property of $u^*$ assumed in Theorem 1.1. The
relative independence of u, $u^i_{y^i}$ was exploited in proving Theorem 1.1
when the special form of the perturbation $\varphi(y,t)$, $y'\psi(t)$ (see proof of
Lemma 1.1), sufficed to conclude (11) and (12). This independence of u and
$u_y$ was taken a priori into consideration when problem (33) was formulated.

Clearly, even if higher order partial derivatives of u w.r. to
x appear in f and L, or if u, $u^i_{y^i}$ are restricted to take values within
certain closed sets, the equivalence of the corresponding problems (8)
and (33) holds again (with appropriate modifications of the definitions
of U, $\bar U$, f and L). We formalize the discussion above in the following
theorem.

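Theorem 1.2: Let $u^* \in U$ be a solution to the problem

$$\text{minimize } J(u) = g(x(t_f)) + \int_{t_0}^{t_f} L(x,u,u^1_x,\dots,u^m_x,t)\,dt \tag{38}$$

subject to: $\dot x = f(x,u,u^1_x,\dots,u^m_x,t)$, $x(t_0) = x_0$, $t \in [t_0,t_f]$, $u \in U$,

$$\big(u^1(h^1(x(t),t),t),\dots,u^m(h^m(x(t),t),t),\ u^1_{y^1}(h^1(x(t),t),t)',\dots,u^m_{y^m}(h^m(x(t),t),t)'\big)' \in V_0 \tag{39}$$

where $V_0 \subset R^{m+nm}$ is closed. Then there exists
$p : [t_0,t_f] \to R^n$ such that

$$-\dot p = L_x + f_x p + \sum_{i=1}^{m} \sum_{j=1}^{q_i} u^{i*}_j\, \nabla_{xx} h^i_j\, (L_i + f_i p) \tag{40}$$

$$L\big(x^*(t),\, u^{1*}(h^1(x^*(t),t),t),\dots,u^{m*}(h^m(x^*(t),t),t),\, u^{1*}_x(h^1(x^*(t),t),t),\dots,u^{m*}_x(h^m(x^*(t),t),t),\, t\big)$$
$$+\, f'\big(x^*(t),\, u^{1*}(h^1(x^*(t),t),t),\dots,u^{m*}(h^m(x^*(t),t),t),\, u^{1*}_x(h^1(x^*(t),t),t),\dots,u^{m*}_x(h^m(x^*(t),t),t),\, t\big)\,p(t)$$
$$\le L\big(x^*(t),\, q^1,\dots,q^m,\, \nabla_x h^1(x^*(t),t)q_1,\dots,\nabla_x h^m(x^*(t),t)q_m,\, t\big)$$
$$+\, f'\big(x^*(t),\, q^1,\dots,q^m,\, \nabla_x h^1(x^*(t),t)q_1,\dots,\nabla_x h^m(x^*(t),t)q_m,\, t\big)\,p(t), \tag{41}$$
$$\forall\, (q^1,\dots,q^m,q_1',\dots,q_m')' \in V_0,$$

$$p(t_f) = \frac{\partial g(x^*(t_f))}{\partial x} \tag{42}$$

for $t \in [t_0,t_f]$. □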

It is remarkable that the established equivalence of the problems
(8) and (33) refers to the optimal trajectories, costs and control values.
It does not refer to any other properties, such as sensitivity, for example.
It is thus possible that different realizations of $u^i(h^i(x,t),t)$ other
than (35) may enjoy sensitivity or other advantages. The following
proposition provides information for tackling such problems.

Proposition 1.1.

(i) If u and v are elements of U, both satisfying (34), so does
$\lambda u + (1-\lambda)v$, $\forall \lambda \in R$.

(ii) Let m = 1, $h^1(x,t) = x_1$ and $\bar x_1$, $\bar u$, $\bar u_1$ be scalar valued functions of
t, $t \in [t_0,t_f]$. Then the function

$$u(x,t) = e^{x_1(x_1-\bar x_1(t))}\,\bar u(t) + [\bar u_1(t) - \bar x_1(t)\,\bar u(t)]\,[x_1 - \bar x_1(t)]$$

satisfies $u(\bar x(t),t) = \bar u(t)$, $u_x(\bar x(t),t) = \bar u_1(t)$.

(iii) Let $\bar x$, $\bar u$, $\bar u_1$ be as in (ii). Assume that the scalar valued
functions u(x,t), v(x,t) satisfy $u(\bar x(t),t) = v(\bar x(t),t) = \bar u(t)$
and $u_x(\bar x(t),t) = v_x(\bar x(t),t) = \bar u_1(t)$. Then so do the
functions $\sqrt{uv}$, $\sqrt{(u^2+v^2)/2}$, assuming that u and v are properly
behaved. □

The proof of this proposition is a matter of straightforward verification.
The assumption in parts (ii) and (iii) of scalar valued quantities
actually induces no loss of conceptual generality, since it can be abandoned,
at the expense, of course, of increased complexity of the corresponding expressions.
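The "straightforward verification" can also be carried out numerically. The sketch below takes part (ii) in the form $u(x,t) = e^{x_1(x_1-\bar x_1(t))}\bar u(t) + [\bar u_1(t) - \bar x_1(t)\bar u(t)][x_1 - \bar x_1(t)]$, builds the affine realization (35) for the same data, and checks part (i) on a combination of the two. The data functions, the value of $\lambda$ and all helper names are assumptions made for illustration.

```python
import math


def exp_realization(u_bar, u1_bar, x1_bar):
    """Part (ii): exponential realization matching value and slope on x1_bar(t)."""
    def u(x1, t):
        xb = x1_bar(t)
        return (math.exp(x1 * (x1 - xb)) * u_bar(t)
                + (u1_bar(t) - xb * u_bar(t)) * (x1 - xb))
    return u


def affine_realization(u_bar, u1_bar, x1_bar):
    """Affine realization (35) for the scalar case h(x, t) = x1."""
    def u(x1, t):
        return u1_bar(t) * (x1 - x1_bar(t)) + u_bar(t)
    return u


def mix(u, v, lam):
    """Part (i): the combination lam * u + (1 - lam) * v, for any real lam."""
    return lambda x1, t: lam * u(x1, t) + (1.0 - lam) * v(x1, t)


def d_dx(fn, x1, t, eps=1e-6):
    """Central finite-difference approximation of the partial derivative in x1."""
    return (fn(x1 + eps, t) - fn(x1 - eps, t)) / (2.0 * eps)


# Hypothetical data (assumptions for illustration only).
x1_bar = math.cos
u_bar = lambda t: 2.0 + math.sin(t)
u1_bar = lambda t: 1.0 + t

u_exp = exp_realization(u_bar, u1_bar, x1_bar)
u_aff = affine_realization(u_bar, u1_bar, x1_bar)
candidates = [u_exp, u_aff, mix(u_exp, u_aff, 3.7)]

times = [0.0, 0.5, 1.0, 2.0]
value_err = max(abs(c(x1_bar(t), t) - u_bar(t)) for c in candidates for t in times)
slope_err = max(abs(d_dx(c, x1_bar(t), t) - u1_bar(t)) for c in candidates for t in times)
print(value_err, slope_err)
```

All three candidates agree in value and slope along the trajectory while differing elsewhere, for example at $x_1 = \bar x_1(t) + 0.5$, which illustrates how different realizations of the same optimal control values can be compared for sensitivity.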

The nonuniqueness of the solution u to problem (8) is obvious
in the light of (34) and Proposition 1.1. Nonetheless, this nonuniqueness
is a nonuniqueness in the representation of $u^i$ as a function of $h^i$ and t,
while $u^i|_t$, $u^i_{y^i}|_t$ are the same for all these representations. The
nonuniqueness of $u^i|_t$, $u^i_{y^i}|_t$, if any, can be characterized in terms of the
possible nonuniqueness of the $a_i(t)$, $b_i(t)$ (see (35)), where one, w.l.o.g.,
restricts $u^i$ to strategies affine in $h^i$.

One very basic difference between problems (8) and (33) is the
following. It is clear that the principle of optimality holds for both of
these problems, in the sense that the last piece of each optimal trajectory
is optimal. The existence of a closed loop control law $(\bar u(x,t),\bar u_1(x,t),\dots,$
$\bar u_m(x,t))$ which results in an optimal solution to problem (33) for every initial
point $(x_0,t_0)$ in a subset of $R^{n+1}$ is guaranteed under certain assumptions, see
[11]. A corresponding statement does not hold for problem (8), i.e. in general
there do not exist functions $u^i$ of $h^i(x,t)$ and t such that $u = (u^1,\dots,u^m)'$ is
an optimal solution to problem (8) for every initial point $(x_0,t_0)$ in a subset
of $R^{n+1}$. This can be easily seen to hold by the following argument. Let
such u exist. Then,

$$\big(u^1(h^1(x,t),t),\dots,u^m(h^m(x,t),t),\ u^1_{y^1}(h^1(x,t),t)',\dots,u^m_{y^m}(h^m(x,t),t)'\big)'$$

is a closed loop control law for problem (33). This implies that there must
exist a solution $(\bar u,\bar u_1,\dots,\bar u_m)$ with $\bar u = (\bar u^1,\dots,\bar u^m)'$ of the partial differential
equation of Dynamic Programming associated with problem (33) which satisfies
$\bar u^i(x,t) = u^i(h^i(x,t),t)$ and $\partial \bar u^i(x,t)/\partial x = \nabla_x h^i(x,t)\,\bar u_i(x,t)$, $i = 1,\dots,m$, which
is not in general true. This difference between problems (8) and (33) emphasizes
the fact that their equivalence holds in a restricted fashion, i.e. for each
initial point considered independently, and not in a global fashion, in the way a
closed loop control law treats the initial points.

Two final remarks before entering the next section are
pertinent here. First, the established equivalence of the problems
(8) and (33) reduces all questions of existence, uniqueness
and of sufficiency conditions for problem (8) to the corresponding ones
for (33). Second, Theorem 1.2 still holds if instead of the initial
condition $x(t_0) = x_0$, it is given: $x^a(t_0) = x^a_0$ and $x^b(t_f) = x^b_f$, where $x = (x^{a\prime},x^{b\prime})'$.
In this case, (42) is modified to

$$p^a(t_f) = \frac{\partial g(x^a(t_f))}{\partial (x^a)} \quad \text{and} \quad p^b(t_0) = -\frac{\partial h(x^b(t_0))}{\partial (x^b)} \tag{43}$$

where the more general cost functional

$$J = g(x^a(t_f)) + h(x^b(t_0)) + \int_{t_0}^{t_f} L(x,u,t)\,dt \tag{44}$$

is considered (see [10]).

2.  A Stackelberg  Game 

In this section we introduce a two-level Stackelberg game and
show how it leads us to the consideration of a nonclassical control
problem. This nonclassical control problem falls into the general class
considered in Section 1. Using the results of Section 1, we analyze
the Stackelberg game of the present section.

Let

$$U = \{u \mid u : R^n \times [t_0,t_f] \to R^{m_1},\ u(x,t) \in R^{m_1} \text{ for } x \in R^n \text{ and } t \in [t_0,t_f],\ u_x(x,t) \text{ exists and } u(x,t),\ u_x(x,t) \text{ are continuous in } x \text{ and piecewise continuous in } t\} \tag{45}$$

$$V = \{v \mid v : [t_0,t_f] \to R^{m_2},\ v \text{ is piecewise continuous in } t\}. \tag{46}$$

Consider the dynamic system

$$\dot x(t) = f(x(t),u(t),v(t),t), \quad x(t_0) = x_0, \quad t \in [t_0,t_f] \tag{47}$$

and the functionals

$$J_1(u,v) = g(x(t_f)) + \int_{t_0}^{t_f} L(x(t),u(t),v(t),t)\,dt \tag{48}$$

$$J_2(u,v) = h(x(t_f)) + \int_{t_0}^{t_f} M(x(t),u(t),v(t),t)\,dt \tag{49}$$

where $u \in U$, $v \in V$, x is the state of the system, assumed to be a continuous
function of t, $x : [t_0,t_f] \to R^n$, and the functions $f : R^n \times R^{m_1} \times R^{m_2} \times [t_0,t_f] \to R^n$,
$g,h : R^n \to R$, $L,M : R^n \times R^{m_1} \times R^{m_2} \times [t_0,t_f] \to R$
are in $C^2$ w.r. to the x, u, v arguments and continuous in t. The u and v are
called strategies and are chosen from U and V, which are called the strategy
spaces, by the two players, the leader and the follower respectively. With
the given definitions, for each choice of u and v, the behavior of the dynamic
system is unambiguously determined, assuming of course that for the selected
pair (u,v) the solution of the differential equation (47) exists over $[t_0,t_f]$.

Let us assume that a Stackelberg equilibrium pair $(u^*,v^*)$
exists. For fixed $u \in U$, Tu is determined by the minimization problem

$$\text{minimize } J_2(u,v)$$
$$\text{subject to: } v \in V \tag{50}$$
$$\dot x = f(x,u(x,t),v,t), \quad x(t_0) = x_0, \quad t \in [t_0,t_f]$$

and thus, applying the Minimum Principle we conclude that for $v \in V$ to be in
Tu, there must exist a function $p : [t_0,t_f] \to R^n$ such that

$$\dot x = f(x,u,v,t) \tag{51-a}$$

$$M_v + f_v p = 0 \tag{51-b}$$

$$-\dot p = M_x + u_x M_u + (f_x + u_x f_u)\,p \tag{51-c}$$

$$x(t_0) = x_0, \quad p(t_f) = \frac{\partial h(x(t_f))}{\partial x}. \tag{51-d}$$

We further assume that U is properly topologized. Conditions (51) define a
set valued mapping $T' : U \to V$. By using the nature of the defined U and V
and the fact that (51) are necessary but not sufficient conditions it is
easily proven that
(i) $Tu \subset T'u$
(ii) $J_2(u,v') \ge J_2(u,v)$ $\forall\, v' \in T'u$, $v \in Tu$,
(iii) $T'u^* \cap Tu^* \supseteq \{v^*\} \ne \emptyset$.


Notice that $J_2(u,v)$ takes one value for given u and any $v \in Tu$, while
$J_2(u,v')$, $v' \in T'u$ does not necessarily do so. We assume now the following.
Assumption (A):

$$J_1(u,v') \ge J_1(u,v) \text{ for } v' \in T'u,\ v \in Tu,\ u \in U_N^* \tag{52}$$

where $U_N^*$ is a n.b.d. of $u^*$ in U.

For (A) to hold it suffices for example: $T = T'$ on $U_N^*$. We conclude
that if (A) holds, then $u^*$ is a local minimum of the problem

$$\text{minimize } J_1(u,v)$$
$$\text{subject to: } u \in U,\ v \in T'u$$

or equivalently

$$\text{minimize } J_1(u,v) \tag{53}$$
$$\text{subject to: } u \in U,\ v \in V$$

$$\dot x = f(x,u,v,t) \tag{53-a}$$

$$-\dot p = M_x + u_x M_u + (f_x + u_x f_u)\,p \tag{53-b}$$

$$M_v + f_v p = 0 \tag{53-c}$$

$$x(t_0) = x_0, \quad p(t_f) = \frac{\partial h(x(t_f))}{\partial x}. \tag{53-d}$$

The  problem  (53)  is  a nonclassical  control  problem  of  the  type  considered  in  the 
previous  section,  since  the  partial  derivative  of  the  control  u w.r.  to  x 

appears  in  the  constraints  of  (53)  which  play  the  role  of  the  system  aifferen- 

tlal  equations  and  state  control  constraints,  with  new  state  (x' ,p'). 

^See  Appendix  A. 


- 


■ -a* 


Notice  that  the  leader  uses  only  x(t)  and  t in  evaluating  u(x(t),t)  and  not 
the  whole  sr^'e  (jc'.p')';  t.e.,  the  value  of  u at  time  t is  composed  in  3 


partial  feedback  fora  with  respect  to  Che  state  (x'.p')';  (recall  the  output 
feedback  in  contrast  to  the  state  feedback  control  laws).  In  this  case,  the 
h^'s  for  the  leader  (u) , are 


h ((x.P), 


t)  - I 


0 i ■ x , t-1 , . . . ,m, 

nxn  nxn  ■ ip  ’1 


and the $h^i$'s for the follower ($v$) are identically zero. Different $h^i$'s may
be used to model different information structures in terms of $x(t)$ and $t$
available to the leader and follower at time $t$. If one were concerned with
a Stackelberg  game  composed  of  N (i  2)  hierarchical  decision  levels  _7_,  .3', 
then the leader would face a nonclassical control problem where the N-th
partial of $u$ with respect to $x$ would appear.

We arrived at the conclusion that the leader is faced with the nonclassical control problem (53). We will assume that the state-control constraint (53-c) can be solved for $v$ over the whole domain of interest to give

$$v = S(x,p,u,t), \tag{54}$$


where  S is  continuous  and  in  C w.r.  to  x and  p.  This  assumption  holds  in  many 
cases,  as  for  example  in  the  linear  Quadratic  case  to  be  considered  in  the  next 
section.  In  any  case,  direct  handling  of  the  constraint  (53-c)  by  appending 
it,  or  assumption  of  its  solvability  in  v,  does  not  seem  to  be  the  core  of 
the matter from a game point of view. However, the following remark is pertinent here. Assume that we allow $v \in \bar{V}$,

$$\bar{V} = \{\, v \mid v : R^n \times [t_0,t_f] \to R^{m_2},\ v(x,t) \text{ piecewise continuous in } t \text{ and Lipschitzian in } x, \text{ where } x \in R^n \text{ and } t \in [t_0,t_f] \,\}$$

instead of $v \in V$. The assumption of solvability of (53-c) will again give

$$v(x,t) = S(x,p,u,t). \tag{56}$$

Since $v(x,t)$ will be substituted in the rest of (53) with $S(x,p,u,t)$ from (56), the leader will be faced with exactly the same problem as after substituting $v(t)$ with $S$ from (54). Therefore, no additional difficulty arises if one allows $\bar{V}$ instead of $V$ and assumes solvability of (53-c).

In any case, for either $V$ or $\bar{V}$, even if (53-c) is not solvable for $v$, the leader's problem can be treated by using Theorem 1.2, where the control $(u,v)$ should be considered as unknown and (53-c) will play the role of a constraint; see (39).


Substituting $v$ from (54) into (53) we obtain

$$\underset{u \in U}{\text{minimize}}\ J(u) = g(x(t_f)) + \int_{t_0}^{t_f} L(x,p,u,t)\,dt \tag{57}$$

subject to:

$$\dot{x} = F_1(x,p,u,t)$$

$$\dot{p} = F_{21}(x,p,u,t) + u_x F_{22}(x,p,u,t)$$

$$x(t_0) = x_0, \quad p(t_f) = \frac{\partial h(x(t_f))}{\partial x}$$

where $L$, $F_1$, $F_{21}$, $F_{22}$ stand for the resulting composite functions.

Problem (57) is a nonclassical control problem like the one treated in Section 1, where $(x',p')'$ is the state of the system. Thus, Theorem 1.2 is applicable and can be used for writing down the leader's necessary conditions. From the results of the previous section, we conclude that the solution for the leader's $u$, if it exists, is not unique. It is interesting to notice that (35) implies that the leader has nothing to lose if he commits himself to an affine in $x$, time varying strategy. With such a commitment, the leader does not deteriorate his cost, does not alter the optimal trajectory, and also the follower's optimal cost is not affected. More noteworthy is that the affine choice for the leader can be made even if $L$, $M$ are nonlinear and $u$, $u_x$ are constrained to take values in given closed sets. In addition, $v$ may be constrained to take values in a given closed set, in which case (53-c) should be substituted by an appropriate inequality. In accordance with the discussion in the previous section, we have that in general there does not exist a strategy $u(x,t)$ which is optimal for every initial point $(x_0,t_0)$ in a subset of $R^{n+1}$.
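The claim that an affine commitment costs the leader nothing rests on the fact that only the values of $u$ and $u_x$ along the trajectory enter the problem. The following minimal Python sketch illustrates this in the scalar case. Everything in it is hypothetical and chosen only for illustration: the nonlinear strategy `u`, the reference trajectory `x_star` (standing in for an optimal trajectory), and the tangent construction in `affine_strategy`, which is one natural affine choice and is not claimed to be the exact form of (35).

```python
import math


def u(x, t):
    """Hypothetical nonlinear leader strategy (illustration only)."""
    return math.sin(x) * (1.0 + t)


def u_x(x, t):
    """Partial derivative of the hypothetical strategy u with respect to x."""
    return math.cos(x) * (1.0 + t)


def x_star(t):
    """Hypothetical reference trajectory standing in for the optimal one."""
    return 0.5 * math.exp(-t)


def affine_strategy(x, t):
    """Affine-in-x, time-varying strategy tangent to u along x_star."""
    xs = x_star(t)
    return u(xs, t) + u_x(xs, t) * (x - xs)


def affine_strategy_x(x, t):
    """Partial derivative of the affine strategy w.r.t. x (independent of x)."""
    return u_x(x_star(t), t)


def max_mismatch(times):
    """Largest gap between (u, u_x) and the affine pair along x_star."""
    worst = 0.0
    for t in times:
        xs = x_star(t)
        worst = max(
            worst,
            abs(affine_strategy(xs, t) - u(xs, t)),
            abs(affine_strategy_x(xs, t) - u_x(xs, t)),
        )
    return worst
```

Along `x_star` the two strategies supply identical time functions for $u$ and $u_x$, so the state equation, the adjoint equation, and the costs evaluated along that trajectory cannot distinguish them; away from the trajectory the strategies differ.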

It has been shown in [4] through a counterexample that the principle of optimality does not hold for Stackelberg games. To make this statement more precise, let us assume that the problem has been solved in $[t_0,t_f]$ and $x^*$ is the optimal trajectory. While the process is at $(x^*(\bar{t}),\bar{t})$, where $t_0 < \bar{t} < t_f$, we stop and solve the same Stackelberg game on $[\bar{t},t_f]$ with initial condition $x(\bar{t}) = x^*(\bar{t})$. Let $\bar{x}^*$ be the optimal trajectory for the second problem. Then $\bar{x}^*$ does not have to coincide with the restriction of $x^*$ on $[\bar{t},t_f]$. The explanation is the following. The leader is faced with the control problem (57), which has boundary conditions $x(t_0) = x_0$ and $p(t_f) = \partial h(x(t_f))/\partial x$, given at both $t_0$ and $t_f$. Let $(x^*,p^*)$ be the optimal trajectory of this problem. If the leader is asked to solve the same control problem on $[\bar{t},t_f]$ with boundary conditions $x(\bar{t}) = x^*(\bar{t})$ and $p(t_f) = \partial h(x(t_f))/\partial x$, there is no necessity for $p(\bar{t}) = p^*(\bar{t})$. Even more, if $\lambda_1$, $\lambda_2$ are the adjoint variables of the leader's control problem on $[t_0,t_f]$ and $\bar{\lambda}_1$, $\bar{\lambda}_2$ are the adjoint variables of the leader's control problem on $[\bar{t},t_f]$, corresponding to $x$ and $p$ respectively, then, since $p$ is free at the initial time of each problem, the transversality conditions give $\lambda_2(t_0) = 0$ and $\bar{\lambda}_2(\bar{t}) = 0$, while $\lambda_1$ and $\bar{\lambda}_1$ satisfy terminal conditions at $t_f$. If dynamic programming were holding it should be $\lambda_2(\bar{t}) = \bar{\lambda}_2(\bar{t}) = 0$, which is not true. Actually, $\lambda_2(\bar{t}) = 0$, $\forall\, \bar{t} \in [t_0,t_f]$, is a necessary condition for dynamic programming to hold. The condition $\lambda_2(\bar{t}) = 0$, $\forall\, \bar{t} \in [t_0,t_f]$, can be used, for example in the linear quadratic game, see (67)-(74), for deriving more explicit conditions in terms of the data of the problem for dynamic programming to hold.

Let $\lambda = (\lambda_1', \lambda_2')'$ denote the adjoint variable for problem (57), with $\lambda_1$, $\lambda_2$ corresponding to $x$ and $p$ respectively. Then the necessary condition of Theorem 1.2 associated with $u_x$ results in

$$[M_u(x,u,S(x,p,u,t),t) + f_u(x,u,S(x,p,u,t),t)\,p]\,\lambda_2' = 0 \quad \forall\, t \in [t_0,t_f], \tag{58}$$

which will generally make the leader's problem singular [9]. This is to be expected, because the leader exerts his influence through the time functions resulting from $u$ and $u_x$, which are actually quite independent, and $u_x$ is not penalized or subjected to any constraint in the initial formulation (47)-(49). In other words, the leader is more powerful than what a first inspection of the original problem indicates. One way to restrict the leader's strength or to avoid the singular problem could be the inclusion of $u_x$ in $L$, i.e., $L = L(x,u,u_x^1,\dots,u_x^m,t)$, which would model a self disciplined leader, or to impose a priori bounds on $u_x$, for example, $\|u_x^i\| \le k$, $\forall\, t \in [t_0,t_f]$, which could be interpreted as a constitutional restriction on a real life leader.


See Appendix B.


3.  A Linear  Quadratic  Stackelberg  Game 


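In the present section we work out a Linear Quadratic Stackelberg game. The leader is penalized for $u_x$ as well, by including it in $L$. We consider the dynamic system

$$\dot{x} = Ax + B_1 u + B_2 v, \quad x(t_0) = x_0, \quad t \in [t_0,t_f] \tag{59}$$

and the cost functionals

$$J_1(u,v) = \frac{1}{2}\Big[ x_f' K_{1f} x_f + \int_{t_0}^{t_f} \Big( x'Q_1x + u'R_{11}u + v'R_{12}v + \sum_{i=1}^{m_1} u_x^{i\prime} R_i u_x^i \Big) dt \Big] \tag{60}$$

$$J_2(u,v) = \frac{1}{2}\Big[ x_f' K_{2f} x_f + \int_{t_0}^{t_f} \big( x'Q_2x + u'R_{21}u + v'R_{22}v \big) dt \Big] \tag{61}$$

where the matrices $A$, $B_i$, $Q_i$, $R_{ij}$, $R_i$ are continuous functions of time and $Q_i$, $R_{ij}$, $R_i$ are symmetric. $R_{22}$ is nonsingular $\forall\, t \in [t_0,t_f]$, which guarantees (54). The follower's necessary conditions are (recall (51))

$$v = -R_{22}^{-1} B_2' p \tag{62}$$

$$\dot{x} = Ax + B_1 u - B_2 R_{22}^{-1} B_2' p \tag{63}$$

$$\dot{p} = -Q_2 x - u_x R_{21} u - A'p - u_x B_1' p \tag{64}$$

$$x(t_0) = x_0, \quad p(t_f) = K_{2f} x_f. \tag{65}$$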
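For reference, the follower's conditions above can be recovered from the stationarity of the follower's Hamiltonian. The following is a sketch under stated assumptions: the symbol $H_2$ is introduced here only for illustration, $u_x$ is taken to be the $n \times m_1$ matrix whose columns are the gradients of the components of $u$, and the leader's announced strategy $u(x,t)$ is treated as a known function of $x$ when differentiating.

```latex
\begin{align*}
H_2 &= \tfrac{1}{2}\left( x'Q_2x + u'R_{21}u + v'R_{22}v \right)
       + p'\left( Ax + B_1u + B_2v \right), \qquad u = u(x,t), \\[4pt]
\frac{\partial H_2}{\partial v} &= R_{22}v + B_2'p = 0
  \;\Longrightarrow\; v = -R_{22}^{-1}B_2'p, \\[4pt]
-\dot{p} &= \frac{\partial H_2}{\partial x}
  = Q_2x + u_xR_{21}u + A'p + u_xB_1'p .
\end{align*}
```

The terms multiplied by $u_x$ arise from the chain rule through $u(x,t)$; they are what make the leader's resulting control problem nonclassical.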

Therefore, the leader's problem is (recall (53), (57))

$$\text{minimize } J(u) = \frac{1}{2}\Big[ x_f'K_{1f}x_f + \int_{t_0}^{t_f}\Big( x'Q_1x + u'R_{11}u + p'B_2R_{22}^{-1}R_{12}R_{22}^{-1}B_2'p + \sum_{i=1}^{m_1} u_x^{i\prime}R_iu_x^i \Big)dt \Big] \tag{66}$$

subject to:

$$\dot{x} = Ax - B_2R_{22}^{-1}B_2'p + B_1u \tag{67}$$

$$\dot{p} = -Q_2x - A'p - u_xB_1'p - u_xR_{21}u \tag{68}$$

$$x(t_0) = x_0, \quad p(t_f) = K_{2f}x_f. \tag{69}$$

We assume that Assumption (A) holds. See also Appendix A.


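The necessary conditions for the leader in accordance with Theorem 1.2 are (67), (68), (69) and

$$R_{11}u + B_1'\lambda_1 - R_{21}u_x'\lambda_2 = 0 \tag{70}$$

$$[R_1u_x^1 : \cdots : R_{m_1}u_x^{m_1}] - \lambda_2 (R_{21}u + B_1'p)' = 0 \tag{71}$$

$$\dot{\lambda}_1 = -Q_1x - A'\lambda_1 + Q_2\lambda_2 \tag{72}$$

$$\dot{\lambda}_2 = -B_2R_{22}^{-1}R_{12}R_{22}^{-1}B_2'p + B_2R_{22}^{-1}B_2'\lambda_1 + A\lambda_2 + B_1u_x'\lambda_2 \tag{73}$$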


$$\lambda_1(t_f) = K_{1f}x_f - K_{2f}\lambda_2(t_f), \quad \lambda_2(t_0) = 0. \tag{74}$$


For simplification we assume further that

$$R_i = \gamma_i I, \quad \gamma_1 = \cdots = \gamma_{m_1} = \gamma > 0, \quad i = 1,\dots,m_1 \tag{75}$$

$$R_{11} = R_{22} = I$$


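and (70), (71) are easily solved for $u$ and $u_x$ to yield

$$u = -\Big[ I - \frac{1}{\gamma}\lambda_2'\lambda_2\, R_{21}R_{21} \Big]^{-1} \Big[ B_1'\lambda_1 - \frac{1}{\gamma}\lambda_2'\lambda_2\, R_{21}B_1'p \Big] \tag{76}$$

$$u_x = \frac{1}{\gamma}\,\lambda_2 (R_{21}u + B_1'p)' \tag{77}$$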


which can be substituted into (67), (68), (72), (73) to yield a nonlinear system of differential equations, with unknowns $x$, $p$, $\lambda_1$, $\lambda_2$ and boundary conditions (69) and (74). If $\gamma \to +\infty$, then (76) and (77) yield $u_x \to 0$ and $u \to -B_1'\lambda_1$, and thus the solution tends to the open loop solution, i.e., $u = u(t)$, $v = v(t)$, as the resulting form of (67), (68), (72), (73) indicates for $\gamma \to +\infty$ ([2], [3]).
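The limiting behaviour as $\gamma \to +\infty$ can be checked numerically in the scalar case. The sketch below is an illustration under explicit assumptions, not the authors' computation: it takes $R_{11} = 1$ and $R_i = \gamma$, and assumes the scalar stationarity pair $u + b_1\lambda_1 - r_{21}u_x\lambda_2 = 0$ and $\gamma u_x - \lambda_2(r_{21}u + b_1 p) = 0$. The function name `leader_controls` and all numerical values are hypothetical.

```python
def leader_controls(lam1, lam2, p, b1, r21, gamma):
    """Solve the assumed scalar stationarity pair for (u, u_x):

        u + b1*lam1 - r21*u_x*lam2 = 0
        gamma*u_x - lam2*(r21*u + b1*p) = 0
    """
    s = lam2 * lam2 / gamma          # scalar analogue of (1/gamma) * lam2' lam2
    denom = 1.0 - s * r21 * r21      # scalar analogue of I - s * R21 * R21
    if abs(denom) < 1e-12:
        raise ValueError("stationarity system is (numerically) singular")
    u = -(b1 * lam1 - s * r21 * b1 * p) / denom
    u_x = lam2 * (r21 * u + b1 * p) / gamma
    return u, u_x


def residuals(u, u_x, lam1, lam2, p, b1, r21, gamma):
    """Residuals of the two assumed stationarity equations."""
    r1 = u + b1 * lam1 - r21 * u_x * lam2
    r2 = gamma * u_x - lam2 * (r21 * u + b1 * p)
    return r1, r2
```

For a very large penalty weight the computed $u_x$ is negligible and $u$ approaches $-b_1\lambda_1$, which is the open loop form; for moderate $\gamma$ the feedback gain $u_x$ is nonzero and both equations are still satisfied.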

Before ending this section, we make the following comment. It could be suggested to the follower to penalize $u_x$ in his criterion while $u_x$ is not penalized in the leader's criterion. This would lead to the appearance of $u_{xx}$ in (68) (assuming $u_{xx}$ exists). Thus, in addition to (58), a similar condition due to $u_{xx}$ appears, which reinforces the singular character of the problem. If the leader now restricts himself to affine strategies in $x$, then $u_{xx} = 0$ and the resulting optimum is as before. Actually, the leader can restrict himself to a quadratic strategy in $x$ (without affecting his global optimum cost and trajectory), having thus three influences on the system, namely $u$, $u_x$, $u_{xx}$, from which only $u$ is penalized in the leader's criterion. Therefore, the leader will do better. For the follower it is not obvious if he will do better or not.

4.  Conclusions

In the present paper, a nonclassical control problem was introduced and analyzed. Problems of this type arise in the study of hierarchical systems, and take into account several information patterns that might be available to the controllers. Two different approaches were presented. The first uses variational techniques, while the second reduces the nonclassical problem to a classical one. The nonexistence of closed loop control laws for this problem was shown. The nonuniqueness of the solution of this problem was considered and explained. The results obtained for this nonclassical control problem were used to study a Stackelberg differential game where the players have current state information only $(x(t),t)$. Necessary conditions that the optimal strategies must satisfy were derived. The inapplicability of dynamic programming to Stackelberg dynamic games was explained. The singular character of the leader's problem was proven and the nonuniqueness of his strategies was proven and characterized. In particular, it was shown that commitment of the leader to an affine time varying strategy does not induce any change to the optimal costs and trajectory. A Linear Quadratic Stackelberg game was also worked out as a specific application.

We end by outlining certain generalizations of the work presented here. We consider first the discrete time versions. Consider the dynamic system

$$x_{k+1} = f\big(x_k,\ u^1(h^1(x_k,k),k),\dots,u^m(h^m(x_k,k),k),\ u_x^1(h^1(x_k,k),k),\dots,u_x^m(h^m(x_k,k),k),\ k\big),$$

$$x_0 \text{ given}, \quad k = 0, 1, \dots, N-1$$

and the cost

$$J(u) = g(x_N) + \sum_{k=0}^{N-1} L\big(x_k,\ u^1(h^1(x_k,k),k),\dots,u^m(h^m(x_k,k),k),\ u_x^1(h^1(x_k,k),k),\dots,u_x^m(h^m(x_k,k),k),\ k\big).$$

The proof of the corresponding Theorem 1.2 is straightforward. An immediate consequence is that the restriction

$$u^i(h^i(x_k,k),k) = A_k^i\, h^i(x_k,k) + B_k^i, \quad i = 1,\dots,m,$$

where $A_k^i$, $B_k^i$ are matrices, does not induce any loss of generality as far as the optimal cost and trajectory are concerned (compare to (35)). Clearly Proposition 1.1 carries over, too.

A discrete  time  version  of  the  Stackelberg  game  of  Section  2 can 
be  defined  (see  ),  and  analyzed  similarly  to  section  2.  Several  information 
patterns can be exploited by employing different $h^i$'s (see (8)). The
restriction  of  the  leader  to  affine  strategies  can  also  be  imposed  in  the 
discrete  case.  The  linear  quadratic  discrete  analog  of  problems  (59)- (61) 
can  also  be  worked  out  in  a similar  way. 

The case where higher order partial derivatives of $u$ w.r. to $x$ appear in (6) and (7) can be treated, and all the analysis of Section 1 carries over. One should assume higher order differentiability of the functions involved. Lemma 1.1 can easily be extended to the case where higher order partials of $v$ w.r. to $y$ appear, making the proof of the corresponding Theorem 1.1 possible. We can also restrict $u^i$ to a polynomial form in terms of the $h^i$'s. The analog of Theorem 1.2 can be easily stated and proven, and Proposition 1.1 also carries over.

Finally, an N-level Stackelberg game where on each $i$-level ($i = 1,\dots,N$) $n_i$ followers operate ($u_1^i,\dots,u_{n_i}^i$), play Nash (or Pareto) among them, and $u_j^i = u_j^i(h_j^i(x,t),t)$, $j = 1,\dots,n_i$, $i = 1,\dots,N$, with given $h_j^i$ and fixed $x_0$, $t_0$, $t_f$, can be easily treated by using the analysis for the nonclassical control problem supplied here.

Appendix  A

In this Appendix we give certain conditions under which Assumption (A) (Section 2) holds.

Lemma A.1: Let $U_\ell$ be a subset of $U$ (see (45)), defined as

$$U_\ell = \{\, u \in U \mid u(x,t) = C(t)x + D(t), \text{ where the } m_1 \times n \text{ matrix } C(t) \text{ and the } m_1 \times 1 \text{ vector } D(t) \text{ are piecewise continuous functions of time over } [t_0,t_f] \,\}. \tag{A-1}$$

Then it holds:

$$\inf_{u \in U_\ell,\ v \in Tu} J_1(u,v) \ \ge\ \inf_{u \in U,\ v \in Tu} J_1(u,v) \ \ge\ \inf_{u \in U,\ v \in T'u} J_1(u,v) \ =\ \inf_{u \in U_\ell,\ v \in T'u} J_1(u,v). \tag{A-2}$$

Proof: The inequalities follow from the facts $U_\ell \subseteq U$, $Tu \subseteq T'u$ $\forall\, u \in U$. The last equality is obvious in the light of (35) and the proof of Theorem 1.2. □
An immediate conclusion of Lemma A.1 is that if

$$\inf_{u \in U_\ell,\ v \in Tu} J_1(u,v) = \inf_{u \in U_\ell,\ v \in T'u} J_1(u,v) \tag{A-3}$$

holds, then Assumption (A) holds (with $U_N = U$). For (A-3) to hold, it suffices that the first order necessary conditions for the follower's problem are also sufficient, for each fixed $u \in U_\ell$. More specifically, for fixed $C(t)$, $D(t)$ as in definition (A-1), we consider the problem

$$\text{minimize } h(x(t_f)) + \int_{t_0}^{t_f} M(x, C(t)x + D(t), v, t)\,dt$$

$$\text{subject to: } v \in V, \quad \dot{x} = f(x, C(t)x + D(t), v, t), \quad x(t_0) = x_0, \quad \forall\, t \in [t_0,t_f] \tag{A-4}$$

and seek conditions under which the first order necessary conditions for an optimal $v^*$ for problem (A-4) (see (51-b)-(51-d)) are also sufficient. Such conditions can be found in Chapter 5-2 of [15]. We formalize this discussion in the following Proposition.

Proposition A.1: If for each $u \in U_\ell$, the first order necessary conditions (51-b)-(51-d) for problem (A-4) are also sufficient, then Assumption (A) holds.

The discussion in the present Appendix generalizes clearly to the case where each $u^i$ depends on $h^i(x,t)$ instead of $x$ and to the case where different $U_\ell$'s are considered; see for example Proposition 1.1(ii).

As an example where Proposition A.1 can be applied, we consider
the linear quadratic game of Section 3. Then, Theorem 5, p. 341 and
Corollary p. 343 of [15] in conjunction with Proposition A.1 yield that if


Qj  1 0»  a 0 then  Assumption  (A)  holds. 
Appendix  B

In this Appendix we investigate under what conditions the Principle of Optimality holds for the Stackelberg games of Sections 2 and 3.

We consider first the linear quadratic game of Section 3. As it was shown in Section 2, $\lambda_2(t) = 0$ $\forall\, t \in [t_0,t_f]$ is a necessary condition for the principle of optimality to hold. With $\lambda_2 \equiv 0$, (73) yields

$$-B_2R_{22}^{-1}R_{12}R_{22}^{-1}B_2'p + B_2R_{22}^{-1}B_2'\lambda_1 = 0$$

from which, by assuming rank $B_2 = m_2$, we obtain equivalently

$$-R_{12}R_{22}^{-1}B_2'p + B_2'\lambda_1 = 0.$$
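The equivalence claimed under the rank assumption can be made explicit. This is a short sketch of the algebra, assuming $B_2$ is $n \times m_2$ with full column rank, so that $B_2'B_2$ is invertible, and that $R_{22}$ is nonsingular as stated in Section 3.

```latex
\begin{align*}
0 &= -B_2R_{22}^{-1}R_{12}R_{22}^{-1}B_2'p + B_2R_{22}^{-1}B_2'\lambda_1
   = B_2R_{22}^{-1}\left[ -R_{12}R_{22}^{-1}B_2'p + B_2'\lambda_1 \right]. \\[4pt]
% Premultiply by R_{22}(B_2'B_2)^{-1}B_2', which is a left inverse of B_2R_{22}^{-1}:
R_{22}(B_2'B_2)^{-1}B_2' \cdot B_2R_{22}^{-1} &= I
  \;\Longrightarrow\; -R_{12}R_{22}^{-1}B_2'p + B_2'\lambda_1 = 0 .
\end{align*}
% The converse direction is immediate: premultiply the last equation by B_2R_{22}^{-1}.
```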

Also, (71) yields

$$u_x^i = 0, \quad i = 1,\dots,m_1. \tag{B-1}$$


We conclude that under the assumption rank $B_2 = m_2$, (67)-(74) simplify to give

$$\dot{x} = Ax + B_1u + B_2v \tag{B-2}$$

$$\dot{\lambda}_1 = -Q_1x - A'\lambda_1 \tag{B-3}$$

$$R_{11}u + B_1'\lambda_1 = 0, \quad R_{12}v + B_2'\lambda_1 = 0 \tag{B-4}$$

$$x(t_0) = x_0, \quad \lambda_1(t_f) = K_{1f}x_f \tag{B-5}$$

$$\dot{p} = -Q_2x - A'p \tag{B-6}$$

$$v = -R_{22}^{-1}B_2'p \tag{B-7}$$

$$p(t_f) = K_{2f}x_f. \tag{B-8}$$

(B-2)-(B-5) show that the leader's problem can be considered as a team problem under the "constraint" (B-1), with optimal solution, say $(u^*,v^*)$, and (B-6)-(B-8) show that the same $v^*$ must be the follower's optimal reaction to the leader's choice $u^*$. Actually, (B-1) is not at all a constraint, since with $\lambda_2 \equiv 0$, (68) (where $u_x^i$ appears) is not really considered by the leader. So, the leader operating under (67) and wanting to minimize (60) may as well choose $u_x^i = 0$, since he is penalized for $u_x^i$, while $u_x^i$ does not appear in (67).

The same analysis and conclusions carry over to the more general game of Section 2 (see (45)-(49) and (54)), since the condition $\lambda_2 = 0$ on $[t_0,t_f]$ comes from the demand that the transversality conditions hold $\forall\, t \in [t_0,t_f]$ and is not affected by the fact that in (48) $u_x$ is not penalized. Notice that if the leader's cost functional (48) is substituted by

$$J_1(u,v) = g(x(t_f)) + \int_{t_0}^{t_f} \Big[ L(x,u,v,t) + \sum_{i=1}^{m_1} u_x^{i\prime} R_i u_x^i \Big] dt, \quad R_i > 0, \quad i = 1,\dots,m_1, \tag{B-9}$$

then (B-1) holds again.

The idea behind the condition $\lambda_2 = 0$ on $[t_0,t_f]$ is that the leader is not really constrained by the follower's adjoint equation and therefore the leader's problem, being independent of the follower's problem, becomes a team control problem.

In conclusion, a necessary condition for the Principle of Optimality to hold for the Stackelberg games of Sections 2 and 3 is that the leader's problem is actually a team control problem. But for a control problem with fixed initial conditions, the Principle of Optimality does hold. We thus have the "if and only if" statement: The Principle of Optimality holds for the problems of Sections 2 and 3 (see (45)-(49), (54) and (59)-(61) respectively) if and only if the leader's problem is a team control problem for both the leader and follower.